The active site of 3CL proteinase (3CLpro) of coronavirus was identified by comparing the crystal structures of
human and porcine coronavirus. Because Ag7088, an inhibitor of the main proteinase of rhinovirus, can bind to the
3CLpro of human coronavirus, it was selected as the reference for molecular docking and database screening. The
ligands from two databases were used to search for potential lead structures by molecular docking. Several
structures from the natural product and ACD-SC databases were found to have lower binding free energy with 3CLpro
than Ag7088. These structures have similar hydrophobicity to Ag7088. Their electrostatic potential and their
hydrogen bond acceptors and donors are complementary to 3CLpro, showing that the strategy of anti-SARS drug design
based on molecular docking and database screening is feasible.
The first case of severe acute respiratory syndrome
(SARS) was identified in November 2002 in Guangdong
Province, China. In March 2003, the putative cause of
SARS was identified as a new coronavirus. SARS-Cov is
a member of the Coronaviridae family of enveloped,
positive-stranded RNA viruses, which have a broad host
range. The coronavirus genome is about 27 to 32 kb long
and can encode 23 putative proteins, including the main
proteinase (Mpro, also called 3CLpro), nucleocapsid (N),
spike (S), membrane (M), and small envelope (E)
proteins. Because the viral main proteinase (3CLpro)
controls the activities of the coronavirus replication
complex, it is an attractive target for therapy and drug
design. A large number of compounds have been
synthesized or isolated in the search for anti-SARS lead
compounds. Virtual screening has two advantages:
searching for lead structures computationally is cheaper
than doing so by experiment, and the calculations can be
performed on compounds that have not yet been
purchased or synthesized. Virtual screening has
therefore been widely used to find initial lead structures
in large compound collections, and a number of studies
have searched for inhibitors of SARS.

In this study, using molecular docking and other
screening filters, we screened several types of
databases, such as an in-house natural product database
and the ACD screening database. Some interesting
results are reported below.


Methods 


Active site identification 


The crystal structure of the main proteinase of human
coronavirus was taken from the Brookhaven Protein
Data Bank (PDB code: 1P9S). Hilgenfeld et al. reported
an inhibitor complex of porcine coronavirus and found
that the SARS coronavirus (SARS-Cov) main proteinase
shows a remarkable degree of conservation of the
substrate-binding sites with porcine coronavirus. The
PDB code of the porcine coronavirus structure is 1LVO.
We aligned these two structures with the "alignment
homology" routine in SYBYL 6.9 and found that the
homology between the two amino acid sequences is
63%. The binding-site region of 1LVO is the same as
that of 1P9S, which shows that this region of the
substrate-binding sites is remarkably conserved. The
active site of human coronavirus could therefore be
identified, and it was verified with the MOLCAD routine.
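
The 63% figure above is a ratio computed over an alignment. As a minimal sketch of how such a ratio can be obtained, the function below counts identical residues over the aligned, non-gap positions of two sequences. The toy sequences are hypothetical and are not taken from 1P9S or 1LVO, and the exact definition used by the SYBYL routine may differ from this simple identity count.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned, non-gap positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        # Skip columns where either sequence has a gap.
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Hypothetical pre-aligned fragments, for illustration only.
toy_a = "MKT-AYL"
toy_b = "MRTQAYI"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```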


Molecular docking 
Hilgenfeld et al. reported that rhinovirus 3CLpro
inhibitors may be modified to make them useful for SARS
therapy. Since Ag7088 (see Figure 1) has entered
clinical trials as an inhibitor of human rhinoviruses, it
is reasonable to select it as the screening reference.
Its structure was optimized by molecular dynamics with
the Tripos force field. Partial atomic charges were
calculated with the Kollman all-atom approximation for
3CLpro and with the Gasteiger-Hückel method for the two
types of inhibitors. Ag7088 was placed in the cavity of
the binding site. All calculations were performed on an
SGI Origin 300 workstation with 32 CPUs.


Figure 1 The structure of reference compound Ag7088. 


AutoDock 3.0 is a suitable program for automated
docking of ligands to their macromolecular protein
receptors. Its individual components are AutoTors,
AutoGrid, and AutoDock. AutoTors defines which bonds
in the ligand are rotatable, which determines the degrees
of freedom (DOF) of the ligand and thus the complexity
of the computations. AutoGrid pre-calculates a
three-dimensional grid of interaction energies for the
macromolecular target using the AMBER force field.
AutoDock then carries out the docking simulation. First,
the ligand moves randomly in any one of six degrees of
freedom (either translation or rotation) and the energy
of the new ligand "state" is calculated. If the energy of
the new state is lower than that of the old state, the new
state is automatically accepted as the next step in
docking. During docking, a maximum of 50 conformers
was considered for each compound (the default setting
is 10 conformers).
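
The move-and-accept step described above can be sketched in a few lines. This is an illustration under stated assumptions, not AutoDock's implementation: `toy_energy` is a hypothetical stand-in for the grid-based interaction energy, the six-component state (three translations, three rotations) and the step size are invented for the example, and only the downhill acceptance mentioned in the text is implemented. How moves to higher energy are handled is not described in this section and is left out.

```python
import random


def toy_energy(state):
    # Hypothetical energy surface: a quadratic bowl with its minimum at the origin.
    return sum(x * x for x in state)


def downhill_search(energy, start, steps=2000, step_size=0.5, seed=0):
    """Perturb one of six degrees of freedom at a time; keep the move if energy drops."""
    rng = random.Random(seed)
    state = list(start)
    current = energy(state)
    for _ in range(steps):
        dof = rng.randrange(6)  # pick one translation or rotation component
        trial = list(state)
        trial[dof] += rng.uniform(-step_size, step_size)
        trial_energy = energy(trial)
        if trial_energy < current:  # lower energy: accept the new state
            state, current = trial, trial_energy
    return state, current


start = [3.0, -2.0, 1.0, 0.5, -1.5, 2.0]
final_state, final_energy = downhill_search(toy_energy, start)
print(f"start energy {toy_energy(start):.2f}, final energy {final_energy:.4f}")
```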


Design of screening strategy 


Binding free energy is an important criterion for
reliable virtual screening. A second necessary condition
is that the docked ligand is located at the active site of
3CLpro. Hydrophobic character is another important
factor in drug design, because it reflects whether a drug
molecule can reach the surface of the protein. It is
usually estimated by the octanol/water partition
coefficient (log P), so log P is used as a further
criterion in this study. During virtual screening we also
consider the complementarity of electrostatic potential
and the formation of hydrogen bonds. All these features
are used as filters for the virtual screening. The flow
chart of the virtual screening is shown in Figure 2.
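
The first three criteria above can be read as a filter cascade. The sketch below is one possible reading, with loudly labeled assumptions: the reference values are those reported for Ag7088 in the Results (binding free energy -63.42 kJ/mol, log P 3.13), but the log P tolerance `max_log_p_diff` is a hypothetical parameter, since the text gives no numerical cutoff, and the three ligands are invented. The electrostatic and hydrogen bond criteria are not modeled here.

```python
from dataclasses import dataclass

REFERENCE_ENERGY = -63.42  # kJ/mol, binding free energy reported for Ag7088
REFERENCE_LOG_P = 3.13     # log P listed for Ag7088 in Table 1


@dataclass
class DockedLigand:
    name: str
    binding_energy: float  # kJ/mol, from docking
    log_p: float
    in_active_site: bool   # True if the docked pose sits in the active site


def passes_filters(ligand: DockedLigand, max_log_p_diff: float = 1.0) -> bool:
    """Apply the energy, location and hydrophobicity filters in turn.

    max_log_p_diff is an assumed tolerance, not a value from the paper.
    """
    if ligand.binding_energy >= REFERENCE_ENERGY:
        return False  # must bind more strongly than the reference
    if not ligand.in_active_site:
        return False  # must dock at the active site
    return abs(ligand.log_p - REFERENCE_LOG_P) <= max_log_p_diff


# Hypothetical docking results, for illustration only.
candidates = [
    DockedLigand("lig_a", -80.0, 3.5, True),
    DockedLigand("lig_b", -50.0, 3.0, True),   # weaker than the reference
    DockedLigand("lig_c", -90.0, 3.2, False),  # docks outside the active site
]
hits = [lig.name for lig in candidates if passes_filters(lig)]
print(hits)
```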


Results and discussion 


The active site of the main proteinase (3CLpro) is shown
in Figure 3; it contains the S1 pocket, the S2 pocket and
a canal-like cavity. Because molecular modeling suggests
that available rhinovirus 3CLpro inhibitors may be
modified to make them useful for treating SARS, we
selected Ag7088 as the reference for virtual screening.
The binding free energy of Ag7088 with 3CLpro is -63.42
kJ/mol in this study. The docking complex is illustrated
in Figure 4. In the docking complex, there are
hydrophobic and electrostatic interactions between
Ag7088 and residues Asp186, Gln187, Pro188 and Ser189,
and one hydrogen bond with residue Gly167.


Figure 3 The active site of 3CLpro.


Figure 4 The complex of Ag7088 and 3CLpro.


In-house natural product database 


In this study, 1541 natural product structures were
prescreened. They were selected from an in-house
database containing more than 25000 structures
collected from the recent literature, principally
components of traditional Chinese medicine (TCM).
From these 25000 structures we first used the filters to
select the macrolactone compounds. These were then
aligned to Ag7088 and their binding free energies with
3CLpro were calculated. The 43 structures with binding
free energies lower than -63.42 kJ/mol were considered
potential lead structures. The screening results are
presented in Figure 5. The ratio of potential lead
structures to the total number of screened structures is
2.79%. This shows that Ag7088 is a good reference
structure for screening and that our screening strategy
is feasible.
Figure 5 Histogram of virtual screening (1: >462 kJ/mol; 2:
<462 and ≥420 kJ/mol; 3: <420 and ≥378 kJ/mol; 4: <378
and ≥336 kJ/mol; 5: <336 and ≥294 kJ/mol; 6: <294 and
≥252 kJ/mol; 7: <252 and ≥210 kJ/mol; 8: <210 and ≥168
kJ/mol; 9: <168 and ≥126 kJ/mol; 10: <126 and ≥84
kJ/mol; 11: <84 and ≥42 kJ/mol; 12: <42 and ≥0 kJ/mol;
13: <0 and ≥-21 kJ/mol; 14: <-21 and ≥-42 kJ/mol; 15:
<-42 and ≥-63.42 kJ/mol; 16: <-63.42 kJ/mol).
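
The sixteen energy classes in the Figure 5 caption can be assigned with a sorted list of lower bin edges. The sketch below is illustrative only: the function name and the sample energies are hypothetical, and a value of exactly 462 kJ/mol, which the caption leaves unassigned, is placed in class 2 here as an assumption.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges (kJ/mol) of classes 15, 14, ..., 2, in ascending order.
LOWER_EDGES = [-63.42, -42.0, -21.0, 0.0, 42.0, 84.0, 126.0,
               168.0, 210.0, 252.0, 294.0, 336.0, 378.0, 420.0]
UPPER_LIMIT = 462.0


def energy_class(energy: float) -> int:
    """Return the Figure 5 class (1 to 16) for a binding free energy in kJ/mol."""
    if energy > UPPER_LIMIT:
        return 1
    if energy < LOWER_EDGES[0]:
        return 16  # better than the Ag7088 reference
    # Number of lower edges that are <= energy (between 1 and 14).
    edges_below = bisect_right(LOWER_EDGES, energy)
    return 16 - edges_below


# Hypothetical energies, for illustration only.
sample = [-80.0, -63.42, -10.0, 0.0, 100.0, 500.0]
histogram = Counter(energy_class(e) for e in sample)
print(sorted(histogram.items()))
```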


Among the 43 potential lead structures, only 6 bind to
the 3CLpro receptor at the active site. Their binding
free energies, log P values, ring sizes and comments are
gathered in Table 1. Hydrophobicity is an important
factor, and the log P values of structures N3 and N5 are
close to that of Ag7088. The rings of these structures
are 18- or 19-membered, which suggests that a suitably
large ring can be accommodated at the active site of the
receptor. Figure 6 shows the docking complex of
structure N1 as an example. Structure N1 is located at
the active site of 3CLpro, and the electrostatic
potentials of N1 and 3CLpro are complementary to each
other. This could increase the binding affinity between
the ligand and the receptor, and such an interaction is
favorable to bioactivity. There is one hydrogen bond
between the carbonyl oxygen atom (O=C) of N1 and
residue Glu165, and there are two hydrogen bonds
between NH groups and residues Phe139 and His171.


Table 1 Results of virtual screening for in-house database


No.      Binding free energy/(kJ·mol⁻¹)   log P   Size of ring   Comment(a)
Ag7088   -63.42                           3.13                   One hydrogen bond
N1       -79.97                           1.639   19             (illegible)
N2       -82.57                           0.608   13             (illegible)
N3       -365.78                          3.614   18             (illegible)
N4       -78.04                           1.639   19             (illegible)
N5       -588.80                          4.079   18             (illegible)
N6       -258.59                          5.158   19             (illegible)


(a) ++++ good; +++++ very good.




For N3, two carbonyl oxygen atoms are hydrogen-bonded
to residue Glu165. In the N4-3CLpro complex there are
four hydrogen bonds between the ligand and residues
Glu165, Gln187, Phe139 and His171. N5 and N6 each form
one hydrogen bond, with residues Gln191 and His163,
respectively. These hydrogen bonds seem to be
favourable to the activity.


Figure 6 The complex of N1 and 3CLpro. (A) N1 in active site;
(B) the electrostatic potential surface of N1 and 3CLpro.


ACD screening database 


After the docking of the in-house database into 3CLpro
was finished, the ACD-SC database was screened. A total
of 16000 compounds were selected on the basis of
molecular weight and other criteria and were then
docked. The results of the virtual screening are
organized in Table 2, and the complexes are shown in
Figure 7. These compounds could bind tightly to the
3CLpro receptor. Structures A1 to A3 contain a sulphone
functional group, and their molecular volumes are
smaller than those of the natural products. Structure A4
has antiviral activity and shares the same scaffold as
Ag7088. The flexibility of the active site may allow
molecules of different sizes to bind at this position.




Figure 7 The complex of ligand and 3CLpro.


886 Chin. J. Chem., 2004, Vol. 22, No. 8 CHEN et al. 
Table 2 Result of virtual screening for ACD screening database

No.   Structure                    Binding free energy/(kJ·mol⁻¹)   log P   Comment(a)
A1    (drawing not recoverable)    -86.31                           1.52    (illegible)
A2    (drawing not recoverable)    -72.79                           2.91    (illegible)
A3    (drawing not recoverable)    -78.92                           3.42    (illegible)
A4    (drawing not recoverable)    -63.29                           0.10    +++


(a) ++++ good; +++++ very good.


Conclusion 


We have presented a novel approach based on
molecular docking and database screening to search for
inhibitors of SARS-Cov. The first step is to identify the
active site of 3CLpro by comparing the crystal structures
of human and porcine coronavirus. Since Ag7088 inhibits
the main proteinase of rhinovirus, it was selected as the
screening reference. Before docking, a prescreened
database of compounds was built. These compounds
were then screened in the putative pocket. Known
antiviral inhibitors such as A4 were found within the
best-scoring list, which shows that our screening
strategy is feasible.

The binding free energy between Ag7088 and 3CLpro
is -63.42 kJ/mol. Several structures from the natural
product and ACD-SC databases were found to have lower
binding free energy than Ag7088. These structures have
similar hydrophobicity to Ag7088. Their electrostatic
potential and their hydrogen bond acceptors and donors
are complementary to 3CLpro. These structures are
potential lead inhibitors against SARS. The synthesis
and bioactivity testing of these compounds will be
carried out later.